A New Approach to Fitting Linear Models in High Dimensional Spaces
Abstract
This thesis presents a new approach to fitting linear models, called “pace regression”, which also overcomes the dimensionality determination problem. Its optimality in minimizing the expected prediction loss is established theoretically for the case where the number of free parameters is infinitely large. In this sense, pace regression outperforms existing procedures for fitting linear models. Dimensionality determination, a special case of fitting linear models, turns out to be a natural by-product. A range of simulation studies is conducted; the results support the theoretical analysis. Throughout the thesis, a deeper understanding is gained of the problem of fitting linear models, and many key issues are discussed. Existing procedures, namely OLS, AIC, BIC, RIC, CIC, CV(d), BS(m), RIDGE, NN-GAROTTE and LASSO, are reviewed and compared, both theoretically and empirically, with the new methods. Estimating a mixing distribution is an indispensable part of pace regression. A measure-based minimum distance approach, covering both probability measures and nonnegative measures, is proposed, and strongly consistent estimators are produced. Of all minimum distance methods for estimating a mixing distribution, only the nonnegative-measure-based one solves the minority cluster problem, which is vital for pace regression. Pace regression has striking advantages over existing techniques for fitting linear models. It also has more general implications for empirical modeling, which are discussed in the thesis.
Related resources
New Approach in Fitting Linear Regression Models with the Aim of Improving Accuracy and Power
The main contribution of this work lies in challenging the common practice of inferential statistics in simple linear regression, with the aim of attaining a higher degree of accuracy when multiple observations are available at at least one level of the regressor variable. We derive sufficient conditions under which one can improve the accuracy of the interval estimations at quite affordable ...
A New High-order Takagi-Sugeno Fuzzy Model Based on Deformed Linear Models
Amongst possible choices for identifying complicated processes for prediction, simulation, and approximation applications, high-order Takagi-Sugeno (TS) fuzzy models are fitting tools. Although they can construct models with rather high complexity, they are not as interpretable as first-order TS fuzzy models. In this paper, we first propose to use Deformed Linear Models (DLMs) in consequence pa...
Fitting Second-order Models to Mixed Two-level and Four-level Factorial Designs: Is There an Easier Procedure?
Fitting response surface models is usually carried out using statistical packages to solve complicated equations in order to produce the estimates of the model coefficients. This paper proposes a new procedure for fitting response surface models to mixed two-level and four-level factorial designs. New and easier formulae are suggested to calculate the linear, quadratic and the interaction coeff...
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: As science, knowledge, and technology evolve, we increasingly deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...
A Numerical Approach for Solving of Two-Dimensional Linear Fredholm Integral Equations with Boubaker Polynomial Bases
In this paper, a new collocation method, which is based on Boubaker polynomials, is introduced for the approximate solutions of a class of two-dimensional linear Fredholm integral equations of the second kind. The properties of two-dimensional Boubaker functions are presented. The fundamental matrices of integration with the collocation points are utilized to reduce the solution of the integral ...